
    Effect of Confidence and Explanation on Accuracy and Trust Calibration in AI-Assisted Decision Making

    Today, AI is increasingly used to help human experts make decisions in high-stakes scenarios. In these scenarios, full automation is often undesirable, not only because of the significance of the outcome, but also because human experts can draw on domain knowledge complementary to the model's to ensure task success. We refer to these scenarios as AI-assisted decision making, where the individual strengths of the human and the AI come together to optimize the joint decision outcome. A key to their success is to appropriately calibrate human trust in the AI on a case-by-case basis; knowing when to trust or distrust the AI allows the human expert to apply their knowledge appropriately, improving decision outcomes in cases where the model is likely to perform poorly. This research conducts a case study of AI-assisted decision making in which humans and AI have comparable performance alone, and explores whether features that reveal case-specific model information can calibrate trust and improve the joint performance of the human and AI. Specifically, we study the effect of showing a confidence score and a local explanation for a particular prediction. Through two human experiments, we show that confidence scores can help calibrate people's trust in an AI model, but trust calibration alone is not sufficient to improve AI-assisted decision making, which may also depend on whether the human can bring in enough unique knowledge to complement the AI's errors. We also highlight problems in using local explanations for AI-assisted decision making and invite the research community to explore new approaches to explainability for calibrating human trust in AI.

    Visualizations for an Explainable Planning Agent

    In this paper, we report on the visualization capabilities of an Explainable AI Planning (XAIP) agent that can support human-in-the-loop decision making. Imposing transparency and explainability requirements on such agents is especially important in order to establish trust and common ground with the end-to-end automated planning system. Visualizing the agent's internal decision-making processes is a crucial step towards achieving this. This may include externalizing the "brain" of the agent -- starting from its sensory inputs, to progressively higher-order decisions made by it in order to drive its planning components. We also show how the planner can bootstrap on the latest techniques in explainable planning to cast plan visualization as a plan explanation problem, and thus provide concise model-based visualizations of its plans. We demonstrate these functionalities in the context of the automated planning components of a smart assistant in an instrumented meeting space.
    Comment: Previously "Mr. Jones -- Towards a Proactive Smart Room Orchestrator" (appeared in the AAAI 2017 Fall Symposium on Human-Agent Groups)

    Bootstrapping Conversational Agents With Weak Supervision

    Many conversational agents in the market today follow a standard bot development framework which requires training intent classifiers to recognize user input. The need to create a proper set of training examples is often the bottleneck in the development process. On many occasions, agent developers have access to historical chat logs that can provide good quantity as well as coverage of training examples. However, the cost of labeling them with tens to hundreds of intents often prohibits taking full advantage of these chat logs. In this paper, we present a framework called search, label, and propagate (SLP) for bootstrapping intents from existing chat logs using weak supervision. The framework reduces hours to days of labeling effort down to minutes of work by using a search engine to find examples, then relies on a data programming approach to automatically expand the labels. We report on a user study that shows positive user feedback for this new approach to building conversational agents, and demonstrates the effectiveness of using data programming for auto-labeling. While the system is developed for training conversational agents, the framework has broader application in significantly reducing labeling effort for training text classifiers.
    Comment: 6 pages, 3 figures, 1 table, Accepted for publication in IAAI 201
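    The search-label-propagate pipeline described above can be illustrated with a minimal sketch. This is not the authors' implementation, and all names here (search_logs, make_keyword_lf, propagate) are hypothetical; a real data programming system would learn labeling-function accuracies with a generative label model rather than use the simple majority vote shown.

    ```python
    def search_logs(logs, query):
        """Step 1 ("search"): retrieve candidate utterances matching a query."""
        return [u for u in logs if query in u.lower()]

    def make_keyword_lf(keyword, intent):
        """Turn a developer-confirmed keyword into a labeling function.
        The function votes `intent` when the keyword appears, else abstains."""
        def lf(utterance):
            return intent if keyword in utterance.lower() else None
        return lf

    def propagate(logs, labeling_functions):
        """Step 3 ("propagate"): label every utterance on which at least one
        labeling function votes, resolving conflicts by majority vote."""
        labeled = {}
        for u in logs:
            votes = [v for v in (lf(u) for lf in labeling_functions) if v is not None]
            if votes:
                labeled[u] = max(set(votes), key=votes.count)
        return labeled

    logs = [
        "I want to reset my password",
        "how do I reset the password",
        "cancel my subscription please",
        "please cancel my account",
        "what's the weather today",
    ]

    # Step 1: the developer searches the chat logs for candidate examples...
    candidates = search_logs(logs, "cancel")
    # Step 2 ("label"): ...inspects the hits and confirms labeling rules,
    # here encoded directly as keyword labeling functions.
    lfs = [
        make_keyword_lf("password", "reset_password"),
        make_keyword_lf("cancel", "cancel_account"),
    ]

    labels = propagate(logs, lfs)
    print(labels)  # utterances matched by no labeling function stay unlabeled
    ```

    The point of the sketch is the division of labor: the developer labels only a handful of search hits, while the labeling functions propagate those decisions across the full log.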

    Adult Social Work and High Risk Domestic Violence Cases

    Summary: This article focuses on adult social work's response in England to high-risk domestic violence cases and the role of adult social workers in Multi-Agency Risk and Assessment Conferences (MARACs). The research was undertaken between 2013 and 2014, focused on one city in England, and involved the research team attending MARACs; interviews with 20 adult social workers, 24 MARAC attendees, and 14 adult service users at time T1 (including follow-up interviews after six months, T2); focus groups with IDVAs and Women's Aid; and an interview with a Women's Aid service user.
    Findings: The findings suggest that although adult social workers accept the need to be involved in domestic violence cases, they are uncertain of what their role is and are confused by the need to operate parallel domestic violence and adult safeguarding approaches, which is further complicated by issues of mental capacity. MARACs are identified as overburdened, under-represented meetings staffed by committed managers. However, they are in danger of becoming managerial processes that neglect the service users they are meant to protect.
    Applications: The article argues for a re-engagement of adult social workers with domestic violence, which has increasingly become over-identified with child protection. It also raises the questions of whether MARACs remain fit for purpose and whether they still represent the best possible response to multi-agency coordination and practice in domestic violence.

    The Knee Clinical Assessment Study – CAS(K). A prospective study of knee pain and knee osteoarthritis in the general population: baseline recruitment and retention at 18 months

    BACKGROUND: Selective non-participation at baseline (due to non-response and non-consent) and loss to follow-up are important concerns for longitudinal observational research. We investigated these matters in the context of baseline recruitment and retention at 18 months of participants for a prospective observational cohort study of knee pain and knee osteoarthritis in the general population. METHODS: Participants were recruited to the Knee Clinical Assessment Study – CAS(K) – by a multi-stage process involving response to two postal questionnaires, consent to further contact and medical record review (optional), and attendance at a research clinic. Follow-up at 18 months was by postal questionnaire. The characteristics of responders/consenters were described for each stage in the recruitment process to identify patterns of selective non-participation and loss to follow-up. The external validity of findings from the clinic attenders was tested by comparing the distribution of WOMAC scores and the association between physical function and obesity with the same parameters measured directly in the target population as a whole. RESULTS: 3106 adults aged 50 years and over reporting knee pain in the previous 12 months were identified from the first baseline questionnaire. Of these, 819 consented to further contact, responded to the second questionnaire, and attended the research clinics; 776 were successfully followed up at 18 months. There was evidence of selective non-participation during recruitment (aged 80 years and over, lower socioeconomic group, currently in employment, experiencing anxiety or depression, brief episode of knee pain within the previous year). This did not cause significant bias in either the distribution of WOMAC scores or the association between physical function and obesity.
    CONCLUSION: Despite recruiting a minority of the target population to the research clinics, and some evidence of selective non-participation, this appears not to have resulted in significant bias of cross-sectional estimates. The main effect of non-participation in the current cohort is likely to be a loss of precision in stratum-specific estimates, e.g. in those aged 80 years and over. The subgroup of individuals who attended the research clinics and who make up the CAS(K) cohort can be used to accurately estimate parameters in the reference population as a whole. The potential for selection bias, however, remains an important consideration in each subsequent analysis.